Abstract

We derive the asymptotic distribution of the supremum distance 
of the deconvolution kernel density estimator to its expectation for 
certain supersmooth deconvolution problems. It turns out that the 
asymptotics are essentially different from the corresponding results for 
ordinary smooth deconvolution. 
1 Introduction and results



Consider the classical deconvolution problem: let $X_1, \ldots, X_n$ be i.i.d.
observations, where $X_i = Y_i + Z_i$ and $Y_i$ and $Z_i$ are independent. Assume
that the unobservable $Y_i$ have distribution function $F$ and density $f$, and
that the random variables $Z_i$ have a known density $k$. Note that the density
$g$ of $X_i$ is equal to the convolution of $f$ and $k$. The nonparametric
deconvolution problem is the problem of estimating $f$ or $F$ from the
observations $X_i$. Thus we want to recover the distribution of $Y_i$ using the
contaminated measurements $X_i$. Additional information on measurement error
models and many practical examples can be found in Carroll et al. (2006).

A popular density estimator for this problem is the deconvolution kernel
density estimator introduced in Carroll and Hall (1988) and Stefanski and
Carroll (1990). This estimator is defined as
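\[
f_{nh}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-itx}\,
\frac{\phi_w(ht)\,\phi_{emp}(t)}{\phi_k(t)}\,dt
= \frac{1}{nh}\sum_{j=1}^{n} v_h\Big(\frac{x - X_j}{h}\Big),
\qquad
v_h(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}
\frac{\phi_w(s)}{\phi_k(s/h)}\,e^{-isx}\,ds.
\tag{1}
\]
Here $w$ denotes a kernel function, $h > 0$ is a bandwidth, $\phi_{emp}$ is the
empirical characteristic function of the sample defined by
$\phi_{emp}(t) = (1/n)\sum_{j=1}^{n} e^{itX_j}$, and $\phi_w$ and $\phi_k$ denote
the characteristic functions of $w$ and $k$, respectively. Note that (1) is not
a standard kernel density estimator, because the kernel function $v_h$ depends
on the bandwidth $h$. For an introduction to the estimator (1) see e.g. Wand
and Jones (1995).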
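As a rough illustration of how such an estimator can be evaluated numerically, the following sketch computes the deconvolution kernel density estimate at a single point. Several choices here are assumptions made only for the example and are not taken from the text: the kernel with characteristic function $(1 - t^2)^3$ on $[-1, 1]$, the standard normal error density, the trapezoidal rule, and the names `phi_w`, `phi_k_normal` and `deconv_kde`. Since both characteristic functions in the example are real and symmetric, the inversion integral reduces to a cosine integral over $[0, 1]$.

```python
import math


def phi_w(t):
    """Kernel characteristic function (assumed): (1 - t^2)^3 on [-1, 1], zero elsewhere."""
    return (1.0 - t * t) ** 3 if abs(t) <= 1.0 else 0.0


def phi_k_normal(t):
    """Characteristic function of a standard normal error density (assumed error law)."""
    return math.exp(-0.5 * t * t)


def deconv_kde(x, sample, h, phi_k=phi_k_normal, steps=400):
    """Evaluate the deconvolution kernel density estimate at the point x.

    For real, symmetric phi_w and phi_k the estimate equals
        (1 / (pi * n * h)) * sum_j int_0^1 cos(s * (X_j - x) / h) * phi_w(s) / phi_k(s / h) ds,
    and the integral over s is approximated by the trapezoidal rule.
    """
    n = len(sample)
    ds = 1.0 / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * ds
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end-point weights
        ratio = phi_w(s) / phi_k(s / h)
        cos_sum = sum(math.cos(s * (xj - x) / h) for xj in sample)
        total += weight * ratio * cos_sum
    return total * ds / (math.pi * n * h)
```

For instance, `deconv_kde(0.0, [0.1, -0.4, 1.2], 0.5)` evaluates the estimate at zero for a three-point sample with bandwidth 0.5. The estimate depends on the data only through the differences $X_j - x$, and it integrates to one over the real line because $\phi_w(0) = 1$.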

The rate of decay to zero at minus and plus infinity of the modulus of the
characteristic function $\phi_k$, and consequently the smoothness of $k$, is
crucial to the asymptotic behaviour of (1). Two cases have been distinguished:
the ordinary smooth case, where $|\phi_k|$ decays algebraically to zero, and the
supersmooth case, where it decreases exponentially. The asymptotics in the
ordinary smooth case are essentially the same as for a kernel estimator of a
higher order derivative of a density, see e.g. Fan (1991), Fan and Liu (1997)
and van Es and Kok (1998). The asymptotics in the supersmooth case have
been studied e.g. in Fan (1991) and van Es and Uh (2004, 2005).

Notice that the above papers study local properties of the estimator (1),
i.e. its pointwise behaviour. We, on the other hand, will focus on the
asymptotic behaviour of the supremum distance of the estimator to its
expectation, which provides a global measure of its performance. Accordingly,
define
\[
M_n = \sup_{0 \le x \le 1} |f_{nh}(x) - \mathrm{E}[f_{nh}(x)]|.
\tag{2}
\]



The fact that the supremum is taken over $[0, 1]$ is not a restriction of
generality and is for convenience only. One could have considered any interval
$[a, b]$. An alternative here is to consider the integrated squared error of the
estimator $f_{nh}$. This was done in Holzmann and Boysen (2006).

The asymptotic distribution of a supremum distance similar to (2), namely
$\sup_{x \in [0,1]} (g(x))^{-1/2} |g_{nh}(x) - \mathrm{E}[g_{nh}(x)]|$, for an
ordinary kernel density estimator $g_{nh}$ in the direct density estimation
setting (i.e. in the error-free case) was derived in Bickel and Rosenblatt
(1973). Owing in a certain sense to the similarity of the asymptotics in the
ordinary smooth deconvolution problem to that in the direct density estimation
problem, qualitatively similar results were obtained in Bissantz et al. (2007)
in the ordinary smooth deconvolution problem for the supremum distance
$\sup_{x \in [0,1]} (g(x))^{-1/2} |f_{nh}(x) - \mathrm{E}[f_{nh}(x)]|$.
Normalisation with $\sqrt{g(x)}$ is explained by the fact that the expression
for the asymptotic variance in the asymptotic normality theorem for the
estimator $f_{nh}(x)$ in the ordinary smooth deconvolution problem involves
$g(x)$, see Fan (1991). No direct extension of the methods used in Bickel and
Rosenblatt (1973) to the supersmooth deconvolution problem is possible and
derivation of the asymptotic distribution of (2) requires a different approach.
This is precisely the task of the present paper. Notice that in (2) we do not
have to normalise with $\sqrt{g(x)}$, because the asymptotic variance in the
asymptotic normality theorem for this case does not depend on $x$, but only on
the error density $k$ (in some global way), see van Es and Uh (2005).

We now state the conditions on the density $k$ and kernel $w$, which will be
used throughout the paper. The condition on $k$ which defines supersmooth
deconvolution is given in Condition 1.



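Condition 1. Assume that
\[
\phi_k(t) = C |t|^{\lambda_0} \exp\big[-|t|^{\lambda}/\mu\big]\,\big(1 + o(|t|^{-1})\big)
\tag{3}
\]
as $|t| \to \infty$, for a constant $0 < \lambda \le 2$ and some constants
$\mu > 0$, $\lambda_0 \in \mathbb{R}$ and $C \in \mathbb{R}$. Furthermore, let
$\phi_k(t) \ne 0$ for all $t \in \mathbb{R}$.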

Condition 1 is stronger than the usual condition on $k$ in supersmooth
deconvolution given e.g. in van Es and Uh (2005), where the term $o(|t|^{-1})$
is not present and one just has the asymptotic equivalence.

Condition 2. Let $\phi_w$ be real-valued, symmetric and have support $[-1, 1]$.
Let $\phi_w(0) = 1$, and assume $\phi_w(1 - t) = A t^{\alpha} + o(t^{\alpha})$ as
$t \downarrow 0$ for some constants $A$ and $\alpha \ge 0$.

For examples of such kernels see for instance van Es and Uh (2005).

The next theorem establishes the asymptotic distribution of $M_n$, which
could prove useful for the construction of uniform confidence bands around
$f$. Since it will appear repeatedly in the paper, we will write $\zeta(h)$ for
$\exp(1/(\mu h^{\lambda}))$.
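Theorem 1. Assume Condition 1 for $\lambda = 2$ and Condition 2 and let
$\mathrm{E}[X_j^2] < \infty$. Let $V$ denote a positive random variable with a
Rayleigh distribution with density $f_V(x) = x \exp[-x^2/2]\, 1_{[x \ge 0]}$.
Then, as $n \to \infty$ and $h \to 0$,
\[
\frac{\sqrt{n}}{h^{\lambda(1+\alpha)+\lambda_0-1}\,\zeta(h)}\, M_n
\xrightarrow{\mathcal{D}}
\frac{\sqrt{2}}{2}\,\frac{A}{\pi C}\Big(\frac{\mu}{\lambda}\Big)^{1+\alpha}\Gamma(\alpha+1)\, V,
\tag{4}
\]
where $\Gamma$ denotes the gamma function.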
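For the confidence bands mentioned above one needs quantiles of the limiting Rayleigh variable $V$. Because the density $x \exp[-x^2/2]$ integrates to the distribution function $1 - \exp[-x^2/2]$ on $x \ge 0$, these quantiles are available in closed form. The helper names `rayleigh_cdf` and `rayleigh_quantile` below are illustrative only.

```python
import math


def rayleigh_cdf(x):
    """Distribution function of V, whose density is x * exp(-x^2 / 2) for x >= 0."""
    return 1.0 - math.exp(-0.5 * x * x) if x > 0.0 else 0.0


def rayleigh_quantile(p):
    """Inverse of rayleigh_cdf for probabilities p in [0, 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return math.sqrt(-2.0 * math.log(1.0 - p))
```

For example, `rayleigh_quantile(0.95)` is approximately 2.448, the square root of the 95% quantile of a chi-square distribution with two degrees of freedom.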

By assuming $\lambda = 2$ we restrict ourselves to deconvolution problems for
error distributions with characteristic functions that have an exponential
tail like the characteristic function of a normal density. The most important
case covered by this condition is standard normal deconvolution, where
$\lambda = 2$, $\lambda_0 = 0$, $\mu = 2$ and $C = 1$. The condition $\lambda = 2$
seems to be essential in the proof of Lemma 3, specifically in (8), where we
prove a condition for tightness of the remainder process $R_n^{(1)}$. Whether it
can be relaxed by other approaches, avoiding tightness, remains open.

The rate of convergence in Theorem 1 once again reflects the difficulty of
the supersmooth deconvolution problem compared to the ordinary smooth
deconvolution. Furthermore, unlike in ordinary smooth deconvolution, see
Bissantz et al. (2007), in order to obtain the asymptotic distribution of $M_n$
we do not have to subtract a drift term. This also has a parallel when
considering the asymptotics of the $\mathrm{ISE}[f_{nh}]$ in the supersmooth
deconvolution, see Holzmann and Boysen (2006) for additional details. Notice
also that unlike the direct density estimation or the ordinary smooth
deconvolution, see Bickel and Rosenblatt (1973) and Bissantz et al. (2007), the
limit distribution in Theorem 1 is not Gumbel, which confirms the conjecture in
Bissantz et al. (2007) for the case $\lambda = 2$.



2 Proof of Theorem 1



The proof of Theorem 1 is based on a decomposition of $f_{nh}(x)$ in van Es and
Uh (2005), which is the basis of the proof of their asymptotic normality
theorem.



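We have
\[
f_{nh}(x) = \frac{1}{\pi C}\, h^{\lambda_0 - 1}
\int_{\epsilon}^{1} \phi_w(s)\, s^{-\lambda_0} \exp\big(s^{\lambda}/(\mu h^{\lambda})\big)\,ds\;
\frac{1}{n}\sum_{j=1}^{n} \cos\Big(\frac{X_j - x}{h}\Big)
+ R_n^{(1)}(x) + R_n^{(2)}(x) + R_n^{(3)}(x),
\tag{5}
\]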
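where $R_n^{(l)}(x) = (1/n)\sum_{j=1}^{n} R_{n,j}^{(l)}(x)$, $l = 1, 2, 3$, and
\[
\begin{aligned}
R_{n,j}^{(1)}(x) &= \frac{1}{\pi C}\, h^{\lambda_0 - 1} \int_{\epsilon}^{1}
\Big(\cos\Big(s\,\frac{X_j - x}{h}\Big) - \cos\Big(\frac{X_j - x}{h}\Big)\Big)
\phi_w(s)\, s^{-\lambda_0} \exp\big(s^{\lambda}/(\mu h^{\lambda})\big)\,ds, \\
R_{n,j}^{(2)}(x) &= \frac{1}{2\pi h} \int_{-\epsilon}^{\epsilon}
\exp\Big(is\,\frac{X_j - x}{h}\Big) \frac{\phi_w(s)}{\phi_k(s/h)}\,ds, \\
R_{n,j}^{(3)}(x) &= \frac{1}{2\pi h} \Big(\int_{-1}^{-\epsilon} + \int_{\epsilon}^{1}\Big)
\exp\Big(is\,\frac{X_j - x}{h}\Big) \phi_w(s)
\Big(\frac{1}{\phi_k(s/h)} - \frac{1}{C}\Big(\frac{|s|}{h}\Big)^{-\lambda_0}
\exp\big(|s|^{\lambda}/(\mu h^{\lambda})\big)\Big)\,ds.
\end{aligned}
\]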

We will write $R_n^{(l)}$, $l = 1, 2, 3$, for the stochastic processes
$R_n^{(l)} = (R_n^{(l)}(x))_{x \in [0,1]}$. Notice that these processes belong to
the space $C[0, 1]$.

Now the rough idea is to derive the asymptotic distribution of the supremum
of the first summand in (5) minus its expectation and to show that the
remainder terms are negligible. Define the process $U_n$ as
$U_n(x) = n^{-1/2}\sum_{j=1}^{n} U_{n,j}(x)$, where
$U_{n,j}(x) = \cos((X_j - x)/h) - \mathrm{E}[\cos((X_j - x)/h)]$.
Note that this is a process with expectation equal to zero at every $x$. Write
\[
S_n = \sup_{0 \le x \le 1} |U_n(x)|.
\]

Lemma 1. Under the conditions of Theorem 1 we have, as $n \to \infty$ and
$h \to 0$,
\[
S_n \xrightarrow{\mathcal{D}} \sup_{0 \le x \le 2\pi} |W(x)|,
\]
where $W$ is a zero mean Gaussian process on $[0, 2\pi]$ with covariance function
$\mathrm{Cov}(W(x_1), W(x_2)) = (1/2)\cos(x_1 - x_2)$.

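Proof. Replacing $x$ by $yh$, by the periodicity of the cosine function we have
for $h \le (2\pi)^{-1}$ that
\[
\begin{aligned}
\sup_{0 \le x \le 1} |U_n(x)|
&= \sup_{0 \le y \le 1/h} \Big| \frac{1}{\sqrt{n}} \sum_{j=1}^{n}
\Big(\cos\Big(\frac{X_j}{h} - y\Big) - \mathrm{E}\Big[\cos\Big(\frac{X_j}{h} - y\Big)\Big]\Big)\Big| \\
&= \sup_{0 \le y \le 2\pi} \Big| \frac{1}{\sqrt{n}} \sum_{j=1}^{n}
\big(\cos(Y_j - y) - \mathrm{E}[\cos(Y_j - y)]\big)\Big| \\
&= \sup_{0 \le y \le 2\pi} |W_n(y)|,
\end{aligned}
\]
where $Y_j = (X_j/h) \bmod 2\pi$ and the process $W_n$ on $[0, 2\pi]$ is given by
$W_n(y) = n^{-1/2}\sum_{j=1}^{n} (W_{n,j}(y) - \mathrm{E}[W_{n,j}(y)])$ with
$W_{n,j}(y) = \cos(Y_j - y)$.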



By Lemma 6 of van Es and Uh (2005) we know that
$Y_j \xrightarrow{\mathcal{D}} \mathrm{Un}(0, 2\pi)$ as $h \to 0$ for each $j$,
where $\mathrm{Un}(0, 2\pi)$ denotes the uniform distribution on $[0, 2\pi]$.
Hence by the dominated convergence theorem we get that
\[
\mathrm{Cov}\big(\cos(Y_j - y_1), \cos(Y_j - y_2)\big)
\to \frac{1}{2\pi} \int_{0}^{2\pi} \cos(u - y_1)\cos(u - y_2)\,du
= \frac{1}{2}\cos(y_1 - y_2).
\]



It follows that we have to study the convergence of the process
$W_n(x) - \mathrm{E}[W_n(x)]$, which belongs to $C[0, 2\pi]$. According to
Prohorov's theorem and in particular Theorem 8.1 of Billingsley (1968), it
suffices to show weak convergence of the finite dimensional distributions and
tightness of the sequence. By the multivariate central limit theorem in the
triangular array scheme or Cramér-Wold device, see Theorem 7.7 in Billingsley
(1968), the finite dimensional distributions of the process $W_n$ converge to
multivariate normal distributions with covariances given by
$\mathrm{Cov}(W(y_1), W(y_2)) = (1/2)\cos(y_1 - y_2)$.
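The limiting covariance can be checked numerically. The sketch below approximates $(1/(2\pi))\int_0^{2\pi} \cos(u - y_1)\cos(u - y_2)\,du$ by an equally spaced Riemann sum and can be compared with $(1/2)\cos(y_1 - y_2)$; the function name `limiting_covariance` and the grid size are choices made for this illustration.

```python
import math


def limiting_covariance(y1, y2, points=1000):
    """Riemann-sum approximation of (1/(2*pi)) * int_0^{2*pi} cos(u - y1) * cos(u - y2) du."""
    step = 2.0 * math.pi / points
    total = sum(
        math.cos(i * step - y1) * math.cos(i * step - y2) for i in range(points)
    )
    return total * step / (2.0 * math.pi)
```

Because the mean of $\cos(U - y)$ is zero when $U$ is uniform on $[0, 2\pi]$, this integral is exactly the covariance of $\cos(U - y_1)$ and $\cos(U - y_2)$.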



To prove tightness, we will verify conditions of Theorem 12.3 of Billingsley
(1968). First of all, notice that the sequence $W_n(0)$ is tight, because the
asymptotic normality of $W_n(0)$ follows by a univariate Lyapunov central
limit theorem in a triangular array scheme, see Theorem 7.3 in Billingsley
(1968). Furthermore, for an arbitrary positive $\eta$,
\[
\begin{aligned}
&P\big(|W_n(y_2) - \mathrm{E}[W_n(y_2)] - (W_n(y_1) - \mathrm{E}[W_n(y_1)])| \ge \eta\big) \\
&\quad \le \frac{1}{\eta^2}\,\mathrm{Var}[W_{n,j}(y_2) - W_{n,j}(y_1)]
\le \frac{1}{\eta^2}\,\mathrm{E}[(W_{n,j}(y_2) - W_{n,j}(y_1))^2]
\le \frac{1}{\eta^2}(y_2 - y_1)^2,
\end{aligned}
\]
which follows from the fact that
\[
|\cos(Y_j - y_2) - \cos(Y_j - y_1)|
= \Big| 2 \sin\Big(\frac{2Y_j - y_2 - y_1}{2}\Big) \sin\Big(\frac{y_1 - y_2}{2}\Big) \Big|
\le |y_1 - y_2|.
\]
Here we used the inequality $|\sin x| \le |x|$. Therefore $W_n$ converges weakly
to a zero mean Gaussian process $W$ on $[0, 2\pi]$ with covariance function
$\mathrm{Cov}(W(y_1), W(y_2)) = (1/2)\cos(y_1 - y_2)$. By the continuous mapping
theorem, see Theorem 5.1 in Billingsley (1968), the supremum of $|W_n|$ then
converges weakly to the supremum of the absolute value of the limit process,
which proves the lemma. □

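Lemma 2. With $V$ as in Theorem 1 we have
\[
\sup_{0 \le x \le 2\pi} |W(x)| \stackrel{\mathcal{D}}{=} \frac{1}{2}\sqrt{2}\, V.
\tag{6}
\]

Proof. Let $N_1$ and $N_2$ denote two independent standard normal random
variables and let us define the process $\tilde{W}$ by
$\tilde{W} = (\tilde{W}(x))_{x \in [0, 2\pi]}$, where
\[
\tilde{W}(x) = \frac{1}{2}\sqrt{2}\,(N_1 \cos x + N_2 \sin x).
\]
Since the covariance function $\mathrm{Cov}(W(x_1), W(x_2))$ of the process $W$,
given by $(1/2)\cos(x_1 - x_2)$, equals
$\mathrm{Cov}(\tilde{W}(x_1), \tilde{W}(x_2))$ by
\[
\begin{aligned}
&\mathrm{Cov}\Big(\frac{1}{2}\sqrt{2}\,(N_1 \cos x_1 + N_2 \sin x_1),\;
\frac{1}{2}\sqrt{2}\,(N_1 \cos x_2 + N_2 \sin x_2)\Big) \\
&\quad = \frac{1}{2}(\cos x_1 \cos x_2 + \sin x_1 \sin x_2)
= \frac{1}{2}\cos(x_1 - x_2),
\end{aligned}
\]
it follows that $W \stackrel{\mathcal{D}}{=} \tilde{W}$.
Next write
\[
\begin{aligned}
\frac{1}{2}\sqrt{2}\,(N_1 \cos x + N_2 \sin x)
&= \frac{1}{2}\sqrt{2}\,\sqrt{N_1^2 + N_2^2}
\Big(\frac{N_1}{\sqrt{N_1^2 + N_2^2}} \cos x + \frac{N_2}{\sqrt{N_1^2 + N_2^2}} \sin x\Big) \\
&= \frac{1}{2}\sqrt{2}\,\sqrt{N_1^2 + N_2^2}\,(\cos \xi \cos x + \sin \xi \sin x) \\
&= \frac{1}{2}\sqrt{2}\,\sqrt{N_1^2 + N_2^2}\,\cos(x - \xi),
\end{aligned}
\tag{7}
\]
for a $\xi$ such that $\cos \xi = N_1/\sqrt{N_1^2 + N_2^2}$ and
$\sin \xi = N_2/\sqrt{N_1^2 + N_2^2}$. The supremum of the absolute value of (7)
is equal to $(1/2)\sqrt{2}\,\sqrt{N_1^2 + N_2^2} = (1/2)\sqrt{2}\, V$, where $V$
has a Rayleigh distribution. This entails (6). □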
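The amplitude-phase identity used in this proof is easy to verify numerically for fixed values of $N_1$ and $N_2$. The sketch below compares a grid approximation of the supremum over $[0, 2\pi]$ with the closed form $(1/2)\sqrt{2}\,\sqrt{N_1^2 + N_2^2}$; the function names and the grid size are assumptions made for the illustration.

```python
import math


def w_path(n1, n2, x):
    """Value at x of the path (sqrt(2)/2) * (n1 * cos x + n2 * sin x)."""
    return 0.5 * math.sqrt(2.0) * (n1 * math.cos(x) + n2 * math.sin(x))


def sup_abs_on_grid(n1, n2, points=20000):
    """Approximate the supremum of |w_path| over [0, 2*pi] by a maximum over a uniform grid."""
    step = 2.0 * math.pi / points
    return max(abs(w_path(n1, n2, i * step)) for i in range(points + 1))


def sup_abs_closed_form(n1, n2):
    """Closed form of the supremum: (sqrt(2)/2) * sqrt(n1^2 + n2^2)."""
    return 0.5 * math.sqrt(2.0) * math.hypot(n1, n2)
```

Drawing `n1` and `n2` from a standard normal generator and applying `sup_abs_closed_form` therefore simulates the limit variable in Lemma 2 directly, without discretising the path.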

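Lemma 3. Let $a_n = \sqrt{n}\, h^{-\lambda(1+\alpha)-\lambda_0+1} (\zeta(h))^{-1}$
denote the normalising sequence in Theorem 1. For $l = 1, 2, 3$ we have
\[
a_n \big(R_n^{(l)} - \mathrm{E}[R_n^{(l)}]\big) \xrightarrow{P} 0
\]
as $n \to \infty$ and $h \to 0$. Here $0$ denotes the zero process on $[0, 1]$.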
Proof. To prove the lemma, we will apply Prohorov's theorem, and in
particular Theorem 8.1 of Billingsley (1968). Firstly, notice that for a fixed
$x$ the remainder terms $a_n(R_n^{(l)}(x) - \mathrm{E}[R_n^{(l)}(x)])$ vanish in
probability, which was proved in van Es and Uh (2005). This implies that the
finite dimensional vectors of the processes
$a_n(R_n^{(l)} - \mathrm{E}[R_n^{(l)}])$ also converge in probability to null
vectors. To establish tightness, we will again verify conditions of
Theorem 12.3 of Billingsley (1968). Notice that when $x = 0$, the sequence
$a_n(R_n^{(l)}(0) - \mathrm{E}[R_n^{(l)}(0)])$ is tight, since it converges to
zero in probability. Furthermore, for an arbitrary positive $\eta$ we have



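\[
\begin{aligned}
&P\big(a_n |R_n^{(1)}(x_2) - \mathrm{E}[R_n^{(1)}(x_2)]
- (R_n^{(1)}(x_1) - \mathrm{E}[R_n^{(1)}(x_1)])| \ge \eta\big) \\
&\quad \le \frac{a_n^2}{\eta^2}\,\mathrm{Var}\big[R_n^{(1)}(x_2) - R_n^{(1)}(x_1)\big] \\
&\quad = \frac{a_n^2}{\eta^2}\,\frac{1}{n}\,\mathrm{Var}\big[R_{n,1}^{(1)}(x_2) - R_{n,1}^{(1)}(x_1)\big] \\
&\quad \le \frac{a_n^2}{\eta^2}\,\frac{1}{n}\,\mathrm{E}\big[(R_{n,1}^{(1)}(x_2) - R_{n,1}^{(1)}(x_1))^2\big] \\
&\quad \le a_n^2\,\frac{1}{\eta^2}\,\frac{1}{C^2\pi^2}\,\frac{1}{n}\, h^{2\lambda_0-2}\,\frac{1}{h^4}\,
\mathrm{E}\big[(|X_1| + 1 + h)^2\big]
\Big(\int_{\epsilon}^{1} (1-s)\,\phi_w(s)\, s^{-\lambda_0}
\exp\big(s^{\lambda}/(\mu h^{\lambda})\big)\,ds\Big)^2 (x_2 - x_1)^2 \\
&\quad \le K\, h^{2(\lambda_0-2)-2+2(2+\alpha)\lambda}\,(\zeta(h))^2\, a_n^2\,\frac{1}{n}\,
\frac{1}{\eta^2}(x_2 - x_1)^2 \\
&\quad = O(h^{2\lambda-4})\,\frac{1}{\eta^2}(x_2 - x_1)^2 \\
&\quad = O(1)\,\frac{1}{\eta^2}(x_2 - x_1)^2,
\end{aligned}
\tag{8}
\]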
where $K$ is some constant. Here we used Lemma 5 of van Es and Uh (2005),
which states that
\[
\int_{\epsilon}^{1} s^{-\lambda_0} (1-s)^{\beta}\, \phi_w(s)
\exp\big(s^{\lambda}/(\mu h^{\lambda})\big)\,ds
\sim A \Big(\frac{\mu}{\lambda}\, h^{\lambda}\Big)^{1+\alpha+\beta} \zeta(h)\,\Gamma(\alpha+\beta+1),
\tag{9}
\]



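and the fact that for $0 \le s \le 1$ and $0 \le x_1 \le x_2 \le 1$ we have
\[
\begin{aligned}
&\Big| \cos\Big(s\,\frac{X_j - x_2}{h}\Big) - \cos\Big(\frac{X_j - x_2}{h}\Big)
- \Big[\cos\Big(s\,\frac{X_j - x_1}{h}\Big) - \cos\Big(\frac{X_j - x_1}{h}\Big)\Big] \Big| \\
&\quad = \Big| \int_{x_1}^{x_2}\!\int_{s}^{1} \frac{\partial^2}{\partial u\,\partial v}
\cos\Big(v\,\frac{X_j - u}{h}\Big)\,dv\,du \Big| \\
&\quad = \Big| \int_{x_1}^{x_2}\!\int_{s}^{1}
\Big[\frac{1}{h}\sin\Big(v\,\frac{X_j - u}{h}\Big)
+ v\,\frac{X_j - u}{h^2}\cos\Big(v\,\frac{X_j - u}{h}\Big)\Big]\,dv\,du \Big| \\
&\quad \le \frac{1}{h^2} \int_{x_1}^{x_2}\!\int_{s}^{1} (|X_j| + 1 + h)\,dv\,du \\
&\quad \le \frac{1}{h^2}\,(|X_j| + 1 + h)(1 - s)\,|x_1 - x_2|.
\end{aligned}
\]
Hence the process $a_n(R_n^{(1)} - \mathrm{E}[R_n^{(1)}])$ is tight.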

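In order to prove tightness of the process
$a_n(R_n^{(2)} - \mathrm{E}[R_n^{(2)}])$, note that, as above, for positive $\eta$
\[
\begin{aligned}
&P\big(a_n |R_n^{(2)}(x_2) - \mathrm{E}[R_n^{(2)}(x_2)]
- (R_n^{(2)}(x_1) - \mathrm{E}[R_n^{(2)}(x_1)])| \ge \eta\big) \\
&\quad \le \frac{a_n^2}{\eta^2}\,\frac{1}{n}\,
\mathrm{E}\big[(R_{n,1}^{(2)}(x_2) - R_{n,1}^{(2)}(x_1))^2\big] \\
&\quad \le 4 a_n^2\,\frac{1}{\eta^2}\,\frac{1}{4\pi^2 h^4 n}
\Big(\int_{-\epsilon}^{\epsilon} \frac{|s|\,|\phi_w(s)|}{|\phi_k(s/h)|}\,ds\Big)^2 (x_2 - x_1)^2 \\
&\quad \le K a_n^2\,\frac{1}{h^4 n}\,(2\epsilon)^2 \epsilon^2
\Big(\frac{1}{\inf_{-\epsilon \le s \le \epsilon} |\phi_k(s/h)|}\Big)^2
\frac{1}{\eta^2}(x_2 - x_1)^2 \\
&\quad \le K a_n^2\,\frac{1}{h^4 n}\,(\epsilon/h)^{-2\lambda_0}
\exp\big(2(\epsilon/h)^{\lambda}/\mu\big)\,\frac{1}{\eta^2}(x_2 - x_1)^2 \\
&\quad = o(1)\,\frac{1}{\eta^2}(x_2 - x_1)^2,
\end{aligned}
\]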
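where $K$ is some constant and where we used the fact that for $0 \le s \le 1$,
\[
\begin{aligned}
\Big| \exp\Big(is\,\frac{X_j - x_2}{h}\Big) - \exp\Big(is\,\frac{X_j - x_1}{h}\Big) \Big|
&\le \Big| \cos\Big(s\,\frac{X_j - x_2}{h}\Big) - \cos\Big(s\,\frac{X_j - x_1}{h}\Big) \Big|
+ \Big| \sin\Big(s\,\frac{X_j - x_2}{h}\Big) - \sin\Big(s\,\frac{X_j - x_1}{h}\Big) \Big| \\
&\le \frac{2s}{h}\,|x_1 - x_2|,
\end{aligned}
\tag{10}
\]
which follows by converting the differences of sines and cosines into products
and using the fact that $|\sin x| \le |x|$. Consequently, the process
$a_n(R_n^{(2)} - \mathrm{E}[R_n^{(2)}])$ is tight.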

To prove tightness of the process $a_n(R_n^{(3)} - \mathrm{E}[R_n^{(3)}])$, we
first introduce the function $u$, given by
\[
u(y) = \frac{C |y|^{\lambda_0} \exp(-|y|^{\lambda}/\mu)}{\phi_k(y)} - 1.
\]
By Condition 1 this function is bounded on $\mathbb{R} \setminus (-\delta, \delta)$,
where $\delta$ is an arbitrary positive number. Moreover, by (3) the function
$x u(x)$ is also bounded and both functions vanish at plus and minus infinity.
It follows that $(s/h)u(s/h)$ is bounded and tends to zero for all fixed $s$
with $|s| \ge \epsilon$ as $h \to 0$.



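Using the function $u$, rewrite $R_{n,j}^{(3)}(x)$ as follows:
\[
\begin{aligned}
R_{n,j}^{(3)}(x)
&= \frac{1}{2\pi h} \Big(\int_{-1}^{-\epsilon} + \int_{\epsilon}^{1}\Big)
\exp\Big(is\,\frac{X_j - x}{h}\Big) \phi_w(s)
\Big(\frac{1}{\phi_k(s/h)} - \frac{1}{C}\Big(\frac{|s|}{h}\Big)^{-\lambda_0}
\exp\big(|s|^{\lambda}/(\mu h^{\lambda})\big)\Big)\,ds \\
&= \frac{1}{2\pi h} \Big(\int_{-1}^{-\epsilon} + \int_{\epsilon}^{1}\Big)
\exp\Big(is\,\frac{X_j - x}{h}\Big) \phi_w(s)\,
\frac{1}{C}\Big(\frac{|s|}{h}\Big)^{-\lambda_0}
\exp\big(|s|^{\lambda}/(\mu h^{\lambda})\big)\, u(s/h)\,ds.
\end{aligned}
\]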
Next note that, as above, for positive $\eta$ we have by (10) that
\[
\begin{aligned}
&P\big(a_n |R_n^{(3)}(x_2) - \mathrm{E}[R_n^{(3)}(x_2)]
- (R_n^{(3)}(x_1) - \mathrm{E}[R_n^{(3)}(x_1)])| \ge \eta\big) \\
&\quad \le \frac{a_n^2}{\eta^2}\,\frac{1}{n}\,
\mathrm{E}\big[(R_{n,1}^{(3)}(x_2) - R_{n,1}^{(3)}(x_1))^2\big] \\
&\quad \le a_n^2\,\frac{1}{\eta^2}\,\frac{1}{4\pi^2 h^2 n}
\Big(\Big(\int_{-1}^{-\epsilon} + \int_{\epsilon}^{1}\Big) |\phi_w(s)|\,
\frac{1}{|C|}\Big(\frac{|s|}{h}\Big)^{-\lambda_0}
\exp\big(|s|^{\lambda}/(\mu h^{\lambda})\big)\,\frac{2|s|}{h}\,|u(s/h)|\,ds\Big)^2 (x_2 - x_1)^2 \\
&\quad = o(1)\,\frac{1}{\eta^2}(x_2 - x_1)^2,
\end{aligned}
\]
and hence $a_n(R_n^{(3)} - \mathrm{E}[R_n^{(3)}])$ is tight. By Prohorov's theorem
each of the three processes now converges weakly to the zero process. Since
the convergence in distribution to a constant entails convergence to the same
constant in probability, this concludes the proof of the lemma. □

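Finally, we combine the obtained results to prove Theorem 1.

Proof of Theorem 1. The proof is immediate from Lemmas 1-3 just proved,
the fact that by (9)
\[
a_n\,\frac{1}{\pi C}\, h^{\lambda_0 - 1} \int_{\epsilon}^{1} \phi_w(s)\, s^{-\lambda_0}
\exp\big(s^{\lambda}/(\mu h^{\lambda})\big)\,ds\;\frac{1}{\sqrt{n}}
\sim \frac{A}{\pi C}\Big(\frac{\mu}{\lambda}\Big)^{1+\alpha}\Gamma(\alpha+1),
\]
and Theorems 4.1 and 5.1 of Billingsley (1968). □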